7-5 Frequency-domain: HPS

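This section introduces frequency-domain methods for pitch tracking. We start with HPS (harmonic product spectrum), whose schematic diagram is shown below: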

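In the down-sampling step, we down-sample the spectrum of the frame. Here r=1 denotes the original spectrum, r=2 keeps every second point (so the length is 1/2 of the original), r=3 keeps every third point (so the length is 1/3 of the original), and so on. This yields a set of "compressed" versions of the spectrum, which are then added together, as shown below: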

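Since each compressed spectrum has a peak near the fundamental frequency, summing them accentuates that peak and makes the location of the fundamental frequency easier to identify.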
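The down-sample-and-add step can be illustrated with a minimal sketch on a toy spectrum. Everything here is hypothetical and chosen for illustration only: the spectrum length `n`, the fundamental `f0` (in bins), the 1/h harmonic amplitudes, and the spurious peak at bin 25. For simplicity, bin k stands for frequency k, so harmonic h sits at index h*f0; in a real FFT output the first bin is 0 Hz and the indexing shifts by one.

```matlab
% Toy magnitude spectrum: harmonics of f0 with amplitudes 1/h, plus one
% spurious peak that is stronger than the fundamental.
n = 600;                          % number of spectrum bins (arbitrary)
f0 = 40;                          % hypothetical fundamental, in bins
spec = zeros(n, 1);
h = 1:floor(n / f0);              % harmonic numbers that fit in the spectrum
spec(h * f0) = 1 ./ h;            % harmonic h at bin h*f0
spec(25) = 1.5;                   % spurious peak
[rawPeak, rawIndex] = max(spec);  % the raw spectrum is fooled by bin 25

maxR = 4;                         % use r = 1, 2, 3, 4
outLen = floor(n / maxR);         % length of the shortest compressed version
hps = zeros(outLen, 1);
for r = 1:maxR
    compressed = spec(r:r:n);     % keep every r-th bin: entry k is spec(k*r)
    hps = hps + compressed(1:outLen);
end
[hpsPeak, hpsIndex] = max(hps);   % the sum is largest where harmonics line up

fprintf('raw peak at bin %d, HPS peak at bin %d\n', rawIndex, hpsIndex);
```

In this toy case the raw spectrum peaks at the spurious bin 25, whereas the summed spectrum peaks at bin 40, because only at the fundamental do all four compressed versions contribute a harmonic.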

Take the singing of Prof. 蘇豐文 in "soo.wav" as an example. If we extract a 32-ms frame (353 points long) starting at sample 15000 and apply HPS to it, the result is as follows:

Example 1: frame2hps01.m

    waveFile = 'soo.wav';
    au = myAudioRead(waveFile);
    startIndex = 15000;
    frameSize = round(32*au.fs/1000);	% 32ms
    endIndex = startIndex+frameSize-1;
    frame = au.signal(startIndex:endIndex);
    zeroPaddedFactor = 15;
    [hps, freq, spec0, spec1, spec2] = frame2hps(frame, au.fs, zeroPaddedFactor, 1);

Output:

    Pitch = 54.222833 semitone
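As a sanity check on the numbers in Example 1, the sketch below converts the reported pitch to Hz and recomputes the frame length. Two assumptions are made that the text does not state: semitone values follow the MIDI convention, semitone = 69 + 12*log2(f/440), and the sampling rate of soo.wav is 11025 Hz, which is consistent with a 32-ms frame of 353 points.

```matlab
% Assumption: MIDI-style semitone scale, where 69 corresponds to 440 Hz.
pitchSemitone = 54.222833;                     % value printed by Example 1
freqHz = 440 * 2^((pitchSemitone - 69) / 12);  % semitone -> Hz
fprintf('%.6f semitone is about %.1f Hz\n', pitchSemitone, freqHz);

% Assumption: fs = 11025 Hz (not stated in the text).
fs = 11025;
frameSize = round(32 * fs / 1000);             % 32 ms expressed in samples
fprintf('frameSize = %d samples\n', frameSize);
```

Under these assumptions the pitch is roughly 187 Hz, and the frame length matches the 353 points quoted above.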

In the plot produced by Example 1:
  1. Subplot 1 is the waveform of the frame.
  2. Subplot 2 shows the power spectrum and its trend, estimated by a 20th-order polynomial.
  3. Subplot 3 shows the trend-subtracted power spectrum and its tapered version.
  4. Subplot 4 shows the components of HPS, which are down-sampled versions of the tapered power spectrum.
  5. Subplot 5 shows the HPS and its maximum within a reasonable range for human pitch.
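The trend estimation and subtraction in subplots 2 and 3 can be sketched as follows. This is not the code inside `frame2hps`; it is a minimal illustration using only base functions, and it assumes a synthetic 200-Hz harmonic frame, a sampling rate of 11025 Hz, a hand-written Hamming window, a power spectrum on a dB scale, and a hypothetical amount of zero padding (`nfft`). The tapering step is omitted because the text does not specify it.

```matlab
% Synthetic 32-ms frame: decaying harmonics of a hypothetical 200-Hz tone.
fs = 11025;                                  % assumed sampling rate (Hz)
frameSize = round(32 * fs / 1000);           % 353 samples
t = (0:frameSize-1)' / fs;
frame = zeros(frameSize, 1);
for h = 1:10
    frame = frame + (1 / h) * sin(2 * pi * 200 * h * t);
end

% Hamming window written out by hand, followed by a zero-padded FFT.
win = 0.54 - 0.46 * cos(2 * pi * (0:frameSize-1)' / (frameSize - 1));
nfft = 16 * frameSize;                       % hypothetical amount of zero padding
spec = fft(frame .* win, nfft);
half = floor(nfft / 2) + 1;                  % keep 0 Hz up to Nyquist
powerDb = 10 * log10(abs(spec(1:half)) .^ 2 + eps);
freq = (0:half-1)' * fs / nfft;              % bin frequencies (Hz), for plotting

% Fit the trend on an axis rescaled to [-1, 1]; raw Hz values raised to the
% 20th power would make the least-squares fit badly conditioned.
polyOrder = 20;
x = linspace(-1, 1, half)';
coef = polyfit(x, powerDb, polyOrder);
trend = polyval(coef, x);
detrended = powerDb - trend;                 % trend-subtracted power spectrum

fprintf('mean before: %.2f dB, mean after: %.2e dB\n', mean(powerDb), mean(detrended));
```

Because the fitted polynomial includes a constant term, the trend-subtracted spectrum has a mean of essentially zero. Running `plot(freq, powerDb, freq, trend)` shows the trend against the spectrum in the manner of subplot 2.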

The characteristics of HPS are described as follows:

The following example uses HPS for pitch tracking:

Example 2: ptByHps01.m

    waveFile = 'soo.wav';
    opt = pitchTrackBasic('defaultOpt');
    opt.frame2pitchOpt.pdf = 'hps';
    showPlot = 1;
    pitch = pitchTrackBasic(waveFile, opt, showPlot);


Audio Signal Processing and Recognition (音訊處理與辨識)